Risk-Sensitive and Mean Variance Optimality in Markov Decision Processes
Abstract
In this note, we compare two approaches to handling risk and variability in discrete-time Markov decision processes: models with exponential utility functions and mean-variance optimality models. Computational approaches for finding optimal decisions with respect to these optimality criteria are presented, and analytical results connecting the two criteria are discussed.
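The connection between the two criteria can be illustrated numerically: for an exponential utility U(x) = -exp(-γx), the certainty equivalent of the total discounted reward expands, for small γ, as mean − (γ/2)·variance, which is exactly a mean-variance trade-off. Below is a minimal sketch (not from the paper; the transition matrix, rewards, discount factor, and risk-aversion coefficient are hypothetical illustration values) that estimates both summaries for a fixed policy on a two-state Markov reward chain:

```python
import numpy as np

# Hypothetical two-state Markov reward chain under a fixed policy.
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])      # transition probabilities
r = np.array([1.0, 5.0])        # one-stage rewards per state
beta = 0.95                     # discount factor
gamma = 0.05                    # risk aversion in U(x) = -exp(-gamma * x)

# Exact mean of the total discounted reward: v = (I - beta*P)^{-1} r.
v_exact = np.linalg.solve(np.eye(2) - beta * P, r)

# Monte Carlo simulation of the discounted reward, starting in state 0.
rng = np.random.default_rng(0)
n_paths, T = 4000, 250          # beta**T is negligible at this horizon
states = np.zeros(n_paths, dtype=int)
totals = np.zeros(n_paths)
disc = 1.0
for _ in range(T):
    totals += disc * r[states]
    disc *= beta
    u = rng.random(n_paths)     # advance all paths at once
    states = np.where(u < P[states, 0], 0, 1)

mean, var = totals.mean(), totals.var()

# Exponential-utility certainty equivalent; for small gamma it behaves
# like mean - (gamma / 2) * var, linking the two optimality criteria.
ce = -np.log(np.mean(np.exp(-gamma * totals))) / gamma

print(f"mean={mean:.2f}  var={var:.2f}  certainty_equivalent={ce:.2f}")
```

Since the utility is strictly concave, the certainty equivalent always lies below the mean, with the gap governed to first order by the variance — the quantity penalized by mean-variance selection rules.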
Similar works
Risk-Sensitive and Average Optimality in Markov Decision Processes
Abstract. This contribution is devoted to risk-sensitive optimality criteria in finite-state Markov decision processes. First, we rederive necessary and sufficient conditions for average optimality of (classical) risk-neutral unichain models. This approach is then extended to the risk-sensitive case, i.e., when expectation of the stream of one-stage costs (or rewards) generated by a Mark...
Cumulative Optimality in Risk-Sensitive and Risk-Neutral Markov Reward Chains
This contribution is devoted to risk-sensitive and risk-neutral optimality in Markov decision chains. Since the traditional optimality criteria (e.g. discounted or average rewards) cannot reflect the variability-risk features of the problem, and using the mean-variance selection rules that stem from the classical work of Markowitz presents some technical difficulties, we are interested in expect...
Sensitive Discount Optimality via Nested Linear Programs for Ergodic Markov Decision Processes
In this paper we discuss sensitive discount optimality for Markov decision processes. The n-discount optimality is a refined selective criterion that generalizes both average optimality and bias optimality. Our approach is based on a system of nested linear programs. In the last section we provide an algorithm for the computation of the Blackwell optimal policy. The n-disco...
Semi-Markov Decision Processes
Considered are infinite-horizon semi-Markov decision processes (SMDPs) with finite state and action spaces. Total expected discounted reward and long-run average expected reward optimality criteria are reviewed. Solution methodology for each criterion is given; constraints and variance sensitivity are also discussed.
Second Order Optimality in Transient and Discounted Markov Decision Chains
Abstract. The article is devoted to second order optimality in Markov decision processes. Attention is primarily focused on the reward variance for discounted models and undiscounted transient models (i.e. where the spectral radius of the transition probability matrix is less than unity). Considering the second order optimality criteria means that in the class of policies maximizing (or minimiz...